Overview

Dataset statistics

Number of variables: 16
Number of observations: 8693
Missing cells: 1400
Missing cells (%): 1.0%
Duplicate rows: 13
Duplicate rows (%): 0.1%
Total size in memory: 1.3 MiB
Average record size in memory: 159.4 B
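The two percentages above can be recomputed from the raw counts. The sketch below assumes that "missing cells" is measured against every cell (rows × columns) and that values are rounded to one decimal; both assumptions are inferred from the numbers, not stated by the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 16
n_observations = 8693
missing_cells = 1400
duplicate_rows = 13

# Assumption: the denominator for missing cells is rows x columns.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the denominator for duplicates is the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```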

Variable types

Categorical: 4
Boolean: 3
Numeric: 9

Alerts

Dataset has 13 (0.1%) duplicate rows [Duplicates]
VIP is highly imbalanced (84.0%) [Imbalance]
HomePlanet has 201 (2.3%) missing values [Missing]
CryoSleep has 217 (2.5%) missing values [Missing]
Destination has 182 (2.1%) missing values [Missing]
VIP has 203 (2.3%) missing values [Missing]
Cabin_Deck has 199 (2.3%) missing values [Missing]
Cabin_Number has 199 (2.3%) missing values [Missing]
Cabin_Side has 199 (2.3%) missing values [Missing]
Age has 178 (2.0%) zeros [Zeros]
RoomService has 5758 (66.2%) zeros [Zeros]
FoodCourt has 5639 (64.9%) zeros [Zeros]
ShoppingMall has 5795 (66.7%) zeros [Zeros]
Spa has 5507 (63.3%) zeros [Zeros]
VRDeck has 5683 (65.4%) zeros [Zeros]
Total_Spend has 3653 (42.0%) zeros [Zeros]

Reproduction

Analysis started: 2024-03-05 09:28:09.778264
Analysis finished: 2024-03-05 09:33:43.074418
Duration: 5 minutes and 33.3 seconds
Software version: ydata-profiling v4.6.5
Download configuration: config.json

Variables

HomePlanet
Categorical

Alerts: MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 201
Missing (%): 2.3%
Memory size: 393.9 KiB
Top values: Earth (4602), Europa (2131), Mars (1759)

Length

Max length: 6
Median length: 5
Mean length: 5.0438059
Min length: 4

Characters and Unicode

Total characters: 42832
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Europa
2nd row: Earth
3rd row: Europa
4th row: Europa
5th row: Earth

Common Values

Value      Count  Frequency (%)
Earth      4602   52.9%
Europa     2131   24.5%
Mars       1759   20.2%
(Missing)  201    2.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value   Count  Frequency (%)
earth   4602   54.2%
europa  2131   25.1%
mars    1759   20.7%
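The two HomePlanet frequency tables report different percentages for the same counts. The numbers are consistent with the first table dividing by all rows (missing included) and the second dividing by non-missing values only. The sketch below checks that reading; the choice of denominators is an inference from the figures, not something the report states.

```python
# Counts copied from the HomePlanet tables above.
counts = {"Earth": 4602, "Europa": 2131, "Mars": 1759}
n_missing = 201

n_present = sum(counts.values())   # non-missing values
n_total = n_present + n_missing    # all rows

# Share of all rows (matches "Common Values").
share_of_all = {k: round(100 * v / n_total, 1) for k, v in counts.items()}

# Share of non-missing values (matches the lowercase table).
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

print(share_of_all)
print(share_of_present)
```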

Most occurring characters

Value  Count  Frequency (%)
a      8492   19.8%
r      8492   19.8%
E      6733   15.7%
t      4602   10.7%
h      4602   10.7%
u      2131   5.0%
o      2131   5.0%
p      2131   5.0%
M      1759   4.1%
s      1759   4.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  34340  80.2%
Uppercase Letter  8492   19.8%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
a      8492   24.7%
r      8492   24.7%
t      4602   13.4%
h      4602   13.4%
u      2131   6.2%
o      2131   6.2%
p      2131   6.2%
s      1759   5.1%

Uppercase Letter
Value  Count  Frequency (%)
E      6733   79.3%
M      1759   20.7%

Most occurring scripts

Value  Count  Frequency (%)
Latin  42832  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
a      8492   19.8%
r      8492   19.8%
E      6733   15.7%
t      4602   10.7%
h      4602   10.7%
u      2131   5.0%
o      2131   5.0%
p      2131   5.0%
M      1759   4.1%
s      1759   4.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  42832  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
a      8492   19.8%
r      8492   19.8%
E      6733   15.7%
t      4602   10.7%
h      4602   10.7%
u      2131   5.0%
o      2131   5.0%
p      2131   5.0%
M      1759   4.1%
s      1759   4.1%

CryoSleep
Boolean

Alerts: MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 217
Missing (%): 2.5%
Memory size: 393.9 KiB

Value      Count  Frequency (%)
False      5439   62.6%
True       3037   34.9%
(Missing)  217    2.5%

Destination
Categorical

Alerts: MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 182
Missing (%): 2.1%
Memory size: 393.9 KiB
Top values: TRAPPIST-1e (5915), 55 Cancri e (1800), PSO J318.5-22 (796)

Length

Max length: 13
Median length: 11
Mean length: 11.187052
Min length: 11

Characters and Unicode

Total characters: 95213
Distinct characters: 23
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TRAPPIST-1e
2nd row: TRAPPIST-1e
3rd row: TRAPPIST-1e
4th row: TRAPPIST-1e
5th row: TRAPPIST-1e

Common Values

Value          Count  Frequency (%)
TRAPPIST-1e    5915   68.0%
55 Cancri e    1800   20.7%
PSO J318.5-22  796    9.2%
(Missing)      182    2.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value        Count  Frequency (%)
trappist-1e  5915   45.8%
55           1800   13.9%
cancri       1800   13.9%
e            1800   13.9%
pso          796    6.2%
j318.5-22    796    6.2%
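The lowercase Destination table looks like a word count rather than a value count: each destination seems to be lowercased and split on whitespace, and the percentages are shares of all words. The sketch below reproduces the table under that assumption. The tokenization rule is inferred from the numbers, not taken from the report.

```python
from collections import Counter

# Value counts copied from the Destination "Common Values" table.
value_counts = {
    "TRAPPIST-1e": 5915,
    "55 Cancri e": 1800,
    "PSO J318.5-22": 796,
}

# Assumption: words are lowercase, whitespace-separated tokens.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
word_shares = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

for word, count in word_counts.most_common():
    print(f"{word:12s} {count:5d} {word_shares[word]:5.1f}%")
```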

Most occurring characters

Value              Count  Frequency (%)
P                  12626  13.3%
T                  11830  12.4%
e                  7715   8.1%
S                  6711   7.0%
-                  6711   7.0%
1                  6711   7.0%
A                  5915   6.2%
I                  5915   6.2%
R                  5915   6.2%
5                  4396   4.6%
Other values (13)  20768  21.8%

Most occurring categories

Value              Count  Frequency (%)
Uppercase Letter   52304  54.9%
Lowercase Letter   16715  17.6%
Decimal Number     14291  15.0%
Dash Punctuation   6711   7.0%
Space Separator    4396   4.6%
Other Punctuation  796    0.8%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
P      12626  24.1%
T      11830  22.6%
S      6711   12.8%
A      5915   11.3%
I      5915   11.3%
R      5915   11.3%
C      1800   3.4%
O      796    1.5%
J      796    1.5%

Lowercase Letter
Value  Count  Frequency (%)
e      7715   46.2%
c      1800   10.8%
i      1800   10.8%
r      1800   10.8%
n      1800   10.8%
a      1800   10.8%

Decimal Number
Value  Count  Frequency (%)
1      6711   47.0%
5      4396   30.8%
2      1592   11.1%
3      796    5.6%
8      796    5.6%

Dash Punctuation
Value  Count  Frequency (%)
-      6711   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  4396   100.0%

Other Punctuation
Value  Count  Frequency (%)
.      796    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   69019  72.5%
Common  26194  27.5%

Most frequent character per script

Latin
Value             Count  Frequency (%)
P                 12626  18.3%
T                 11830  17.1%
e                 7715   11.2%
S                 6711   9.7%
A                 5915   8.6%
I                 5915   8.6%
R                 5915   8.6%
c                 1800   2.6%
i                 1800   2.6%
r                 1800   2.6%
Other values (5)  6992   10.1%

Common
Value    Count  Frequency (%)
-        6711   25.6%
1        6711   25.6%
5        4396   16.8%
(space)  4396   16.8%
2        1592   6.1%
3        796    3.0%
8        796    3.0%
.        796    3.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  95213  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
P                  12626  13.3%
T                  11830  12.4%
e                  7715   8.1%
S                  6711   7.0%
-                  6711   7.0%
1                  6711   7.0%
A                  5915   6.2%
I                  5915   6.2%
R                  5915   6.2%
5                  4396   4.6%
Other values (13)  20768  21.8%

Age
Real number (ℝ)

Alerts: ZEROS

Distinct: 81
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.82793
Minimum: 0
Maximum: 79
Zeros: 178
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 20
Median: 27
Q3: 37
95-th percentile: 55
Maximum: 79
Range: 79
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 14.339054
Coefficient of variation (CV): 0.49740145
Kurtosis: 0.16715426
Mean: 28.82793
Median Absolute Deviation (MAD): 9
Skewness: 0.42347771
Sum: 250601.2
Variance: 205.60848
Monotonicity: Not monotonic
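Several of the Age statistics are derived from others: the range and IQR come from the quantiles, the variance is the squared standard deviation, the CV is the standard deviation divided by the mean, and the sum is the mean times the row count. The sketch below recomputes them from the rounded figures printed above, so small discrepancies in the last digits are expected.

```python
# Figures copied from the Age statistics above (already rounded by the report).
mean = 28.82793
std = 14.339054
minimum, maximum = 0, 79
q1, q3 = 20, 37
n_rows = 8693  # Age has no missing values, so every row contributes

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
variance = std ** 2               # Variance
cv = std / mean                   # Coefficient of variation
total = mean * n_rows             # Sum

print(f"Range: {value_range}, IQR: {iqr}")
print(f"Variance: {variance:.5f}")
print(f"CV: {cv:.8f}")
print(f"Sum: {total:.1f}")
```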
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
24                 324    3.7%
18                 320    3.7%
21                 311    3.6%
19                 293    3.4%
23                 292    3.4%
22                 291    3.3%
20                 277    3.2%
26                 268    3.1%
28                 267    3.1%
27                 259    3.0%
Other values (71)  5791   66.6%

Minimum 10 values

Value  Count  Frequency (%)
0      178    2.0%
1      67     0.8%
2      75     0.9%
3      75     0.9%
4      71     0.8%
5      33     0.4%
6      40     0.5%
7      52     0.6%
8      46     0.5%
9      42     0.5%

Maximum 10 values

Value  Count  Frequency (%)
79     3      < 0.1%
78     3      < 0.1%
77     2      < 0.1%
76     2      < 0.1%
75     4      < 0.1%
74     5      0.1%
73     7      0.1%
72     4      < 0.1%
71     7      0.1%
70     9      0.1%

VIP
Boolean

Alerts: IMBALANCE, MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 203
Missing (%): 2.3%
Memory size: 393.9 KiB

Value      Count  Frequency (%)
False      8291   95.4%
True       199    2.3%
(Missing)  203    2.3%
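The 84.0% imbalance figure for VIP can be reproduced as one minus the normalized Shannon entropy of the non-missing class counts. The sketch below is a plausible reconstruction that matches the reported value; treating it as the exact formula ydata-profiling uses is an assumption, and `imbalance_score` is a name chosen here for illustration.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    0.0 means perfectly balanced, 1.0 means a single class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)

# Non-missing VIP counts copied from the table above: False, True.
vip_counts = [8291, 199]
score = imbalance_score(vip_counts)
print(f"VIP imbalance: {100 * score:.1f}%")
```

By the same measure, Transported (4378 vs. 4315) scores very close to zero, which is consistent with it raising no imbalance alert.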

RoomService
Real number (ℝ)

Alerts: ZEROS

Distinct: 1273
Distinct (%): 14.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 220.00932
Minimum: 0
Maximum: 14327
Zeros: 5758
Zeros (%): 66.2%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 41
95-th percentile: 1256.8
Maximum: 14327
Range: 14327
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 660.51905
Coefficient of variation (CV): 3.0022322
Kurtosis: 66.577452
Mean: 220.00932
Median Absolute Deviation (MAD): 0
Skewness: 6.3977659
Sum: 1912541
Variance: 436285.42
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    5758   66.2%
1                    117    1.3%
2                    79     0.9%
3                    61     0.7%
4                    47     0.5%
5                    28     0.3%
9                    25     0.3%
8                    24     0.3%
6                    24     0.3%
14                   21     0.2%
Other values (1263)  2509   28.9%

Minimum 10 values

Value  Count  Frequency (%)
0      5758   66.2%
1      117    1.3%
2      79     0.9%
3      61     0.7%
4      47     0.5%
5      28     0.3%
6      24     0.3%
7      17     0.2%
8      24     0.3%
9      25     0.3%

Maximum 10 values

Value  Count  Frequency (%)
14327  1      < 0.1%
9920   1      < 0.1%
8586   1      < 0.1%
8243   1      < 0.1%
8209   1      < 0.1%
8168   1      < 0.1%
8151   1      < 0.1%
8142   1      < 0.1%
8030   1      < 0.1%
7406   1      < 0.1%

FoodCourt
Real number (ℝ)

Alerts: ZEROS

Distinct: 1507
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 448.43403
Minimum: 0
Maximum: 29813
Zeros: 5639
Zeros (%): 64.9%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 61
95-th percentile: 2669.4
Maximum: 29813
Range: 29813
Interquartile range (IQR): 61

Descriptive statistics

Standard deviation: 1595.7906
Coefficient of variation (CV): 3.5585851
Kurtosis: 74.856189
Mean: 448.43403
Median Absolute Deviation (MAD): 0
Skewness: 7.1775152
Sum: 3898237
Variance: 2546547.7
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    5639   64.9%
1                    116    1.3%
2                    75     0.9%
3                    53     0.6%
4                    53     0.6%
5                    33     0.4%
6                    31     0.4%
9                    28     0.3%
7                    27     0.3%
10                   27     0.3%
Other values (1497)  2611   30.0%

Minimum 10 values

Value  Count  Frequency (%)
0      5639   64.9%
1      116    1.3%
2      75     0.9%
3      53     0.6%
4      53     0.6%
5      33     0.4%
6      31     0.4%
7      27     0.3%
8      20     0.2%
9      28     0.3%

Maximum 10 values

Value  Count  Frequency (%)
29813  1      < 0.1%
27723  1      < 0.1%
27071  1      < 0.1%
26830  1      < 0.1%
21066  1      < 0.1%
18481  1      < 0.1%
17958  1      < 0.1%
17901  1      < 0.1%
17687  1      < 0.1%
17432  1      < 0.1%

ShoppingMall
Real number (ℝ)

Alerts: ZEROS

Distinct: 1115
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 169.5723
Minimum: 0
Maximum: 23492
Zeros: 5795
Zeros (%): 66.7%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 22
95-th percentile: 912.4
Maximum: 23492
Range: 23492
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 598.00716
Coefficient of variation (CV): 3.5265616
Kurtosis: 336.01735
Mean: 169.5723
Median Absolute Deviation (MAD): 0
Skewness: 12.763842
Sum: 1474092
Variance: 357612.57
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    5795   66.7%
1                    153    1.8%
2                    80     0.9%
3                    59     0.7%
4                    45     0.5%
5                    38     0.4%
7                    36     0.4%
6                    34     0.4%
13                   29     0.3%
9                    28     0.3%
Other values (1105)  2396   27.6%

Minimum 10 values

Value  Count  Frequency (%)
0      5795   66.7%
1      153    1.8%
2      80     0.9%
3      59     0.7%
4      45     0.5%
5      38     0.4%
6      34     0.4%
7      36     0.4%
8      28     0.3%
9      28     0.3%

Maximum 10 values

Value  Count  Frequency (%)
23492  1      < 0.1%
12253  1      < 0.1%
10705  1      < 0.1%
10424  1      < 0.1%
9058   1      < 0.1%
7810   1      < 0.1%
7185   1      < 0.1%
7148   1      < 0.1%
7104   1      < 0.1%
6805   1      < 0.1%

Spa
Real number (ℝ)

Alerts: ZEROS

Distinct: 1327
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 304.58886
Minimum: 0
Maximum: 22408
Zeros: 5507
Zeros (%): 63.3%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 53
95-th percentile: 1575.2
Maximum: 22408
Range: 22408
Interquartile range (IQR): 53

Descriptive statistics

Standard deviation: 1125.5626
Coefficient of variation (CV): 3.6953503
Kurtosis: 82.920686
Mean: 304.58886
Median Absolute Deviation (MAD): 0
Skewness: 7.7164496
Sum: 2647791
Variance: 1266891.1
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    5507   63.3%
1                    146    1.7%
2                    105    1.2%
5                    53     0.6%
3                    53     0.6%
4                    46     0.5%
7                    34     0.4%
6                    33     0.4%
9                    28     0.3%
8                    28     0.3%
Other values (1317)  2660   30.6%

Minimum 10 values

Value  Count  Frequency (%)
0      5507   63.3%
1      146    1.7%
2      105    1.2%
3      53     0.6%
4      46     0.5%
5      53     0.6%
6      33     0.4%
7      34     0.4%
8      28     0.3%
9      28     0.3%

Maximum 10 values

Value  Count  Frequency (%)
22408  1      < 0.1%
18572  1      < 0.1%
16594  1      < 0.1%
16139  1      < 0.1%
15586  1      < 0.1%
15331  1      < 0.1%
15238  1      < 0.1%
14970  1      < 0.1%
13995  1      < 0.1%
13902  1      < 0.1%

VRDeck
Real number (ℝ)

Alerts: ZEROS

Distinct: 1306
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 298.26182
Minimum: 0
Maximum: 24133
Zeros: 5683
Zeros (%): 65.4%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 40
95-th percentile: 1480.2
Maximum: 24133
Range: 24133
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 1134.1264
Coefficient of variation (CV): 3.8024525
Kurtosis: 87.883437
Mean: 298.26182
Median Absolute Deviation (MAD): 0
Skewness: 7.9045544
Sum: 2592790
Variance: 1286242.7
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    5683   65.4%
1                    139    1.6%
2                    70     0.8%
3                    56     0.6%
5                    51     0.6%
4                    47     0.5%
6                    32     0.4%
8                    30     0.3%
7                    29     0.3%
9                    25     0.3%
Other values (1296)  2531   29.1%

Minimum 10 values

Value  Count  Frequency (%)
0      5683   65.4%
1      139    1.6%
2      70     0.8%
3      56     0.6%
4      47     0.5%
5      51     0.6%
6      32     0.4%
7      29     0.3%
8      30     0.3%
9      25     0.3%

Maximum 10 values

Value  Count  Frequency (%)
24133  1      < 0.1%
20336  1      < 0.1%
17306  1      < 0.1%
17074  1      < 0.1%
16337  1      < 0.1%
14485  1      < 0.1%
12708  1      < 0.1%
12685  1      < 0.1%
12682  1      < 0.1%
12424  1      < 0.1%

Transported
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 334.4 KiB

Value  Count  Frequency (%)
True   4378   50.4%
False  4315   49.6%

Passenger_Group
Real number (ℝ)

Distinct: 6217
Distinct (%): 71.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4633.3896
Minimum: 1
Maximum: 9280
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 465.6
Q1: 2319
Median: 4630
Q3: 6883
95-th percentile: 8819.4
Maximum: 9280
Range: 9279
Interquartile range (IQR): 4564

Descriptive statistics

Standard deviation: 2671.0289
Coefficient of variation (CV): 0.57647404
Kurtosis: -1.1817463
Mean: 4633.3896
Median Absolute Deviation (MAD): 2277
Skewness: 0.0020202219
Sum: 40278056
Variance: 7134395.1
Monotonicity: Increasing
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
4498                 8      0.1%
8168                 8      0.1%
8728                 8      0.1%
8796                 8      0.1%
8956                 8      0.1%
4256                 8      0.1%
984                  8      0.1%
9081                 8      0.1%
8988                 8      0.1%
5756                 8      0.1%
Other values (6207)  8613   99.1%

Minimum 10 values

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      2      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      2      < 0.1%
7      1      < 0.1%
8      3      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
9280   2      < 0.1%
9279   1      < 0.1%
9278   1      < 0.1%
9276   1      < 0.1%
9275   3      < 0.1%
9274   1      < 0.1%
9272   2      < 0.1%
9270   1      < 0.1%
9268   1      < 0.1%
9267   2      < 0.1%

Cabin_Deck
Categorical

Alerts: MISSING

Distinct: 8
Distinct (%): 0.1%
Missing: 199
Missing (%): 2.3%
Memory size: 393.9 KiB
Top values: F (2794), G (2559), E (876), B (779), C (747), Other values (3) (739)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8494
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: F
3rd row: A
4th row: A
5th row: F

Common Values

Value      Count  Frequency (%)
F          2794   32.1%
G          2559   29.4%
E          876    10.1%
B          779    9.0%
C          747    8.6%
D          478    5.5%
A          256    2.9%
T          5      0.1%
(Missing)  199    2.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value  Count  Frequency (%)
f      2794   32.9%
g      2559   30.1%
e      876    10.3%
b      779    9.2%
c      747    8.8%
d      478    5.6%
a      256    3.0%
t      5      0.1%

Most occurring characters

Value  Count  Frequency (%)
F      2794   32.9%
G      2559   30.1%
E      876    10.3%
B      779    9.2%
C      747    8.8%
D      478    5.6%
A      256    3.0%
T      5      0.1%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  8494   100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
F      2794   32.9%
G      2559   30.1%
E      876    10.3%
B      779    9.2%
C      747    8.8%
D      478    5.6%
A      256    3.0%
T      5      0.1%

Most occurring scripts

Value  Count  Frequency (%)
Latin  8494   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
F      2794   32.9%
G      2559   30.1%
E      876    10.3%
B      779    9.2%
C      747    8.8%
D      478    5.6%
A      256    3.0%
T      5      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8494   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
F      2794   32.9%
G      2559   30.1%
E      876    10.3%
B      779    9.2%
C      747    8.8%
D      478    5.6%
A      256    3.0%
T      5      0.1%

Cabin_Number
Real number (ℝ)

Alerts: MISSING

Distinct: 1817
Distinct (%): 21.4%
Missing: 199
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 600.36767
Minimum: 0
Maximum: 1894
Zeros: 18
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 31
Q1: 167.25
Median: 427
Q3: 999
95-th percentile: 1569.35
Maximum: 1894
Range: 1894
Interquartile range (IQR): 831.75

Descriptive statistics

Standard deviation: 511.86723
Coefficient of variation (CV): 0.85258959
Kurtosis: -0.71277235
Mean: 600.36767
Median Absolute Deviation (MAD): 329
Skewness: 0.71835962
Sum: 5099523
Variance: 262008.06
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
82                   28     0.3%
86                   22     0.3%
19                   22     0.3%
56                   21     0.2%
176                  21     0.2%
97                   21     0.2%
230                  20     0.2%
269                  19     0.2%
65                   19     0.2%
123                  19     0.2%
Other values (1807)  8282   95.3%
(Missing)            199    2.3%

Minimum 10 values

Value  Count  Frequency (%)
0      18     0.2%
1      15     0.2%
2      11     0.1%
3      16     0.2%
4      7      0.1%
5      13     0.1%
6      12     0.1%
7      9      0.1%
8      13     0.1%
9      16     0.2%

Maximum 10 values

Value  Count  Frequency (%)
1894   1      < 0.1%
1893   1      < 0.1%
1892   1      < 0.1%
1891   1      < 0.1%
1888   2      < 0.1%
1886   1      < 0.1%
1884   1      < 0.1%
1880   1      < 0.1%
1878   1      < 0.1%
1877   1      < 0.1%

Cabin_Side
Categorical

Alerts: MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 199
Missing (%): 2.3%
Memory size: 393.9 KiB
Top values: S (4288), P (4206)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8494
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P
2nd row: S
3rd row: S
4th row: S
5th row: S

Common Values

Value      Count  Frequency (%)
S          4288   49.3%
P          4206   48.4%
(Missing)  199    2.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value  Count  Frequency (%)
s      4288   50.5%
p      4206   49.5%

Most occurring characters

Value  Count  Frequency (%)
S      4288   50.5%
P      4206   49.5%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  8494   100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
S      4288   50.5%
P      4206   49.5%

Most occurring scripts

Value  Count  Frequency (%)
Latin  8494   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
S      4288   50.5%
P      4206   49.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8494   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
S      4288   50.5%
P      4206   49.5%

Total_Spend
Real number (ℝ)

Alerts: ZEROS

Distinct: 2336
Distinct (%): 26.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1440.8663
Minimum: 0
Maximum: 35987
Zeros: 3653
Zeros (%): 42.0%
Negative: 0
Negative (%): 0.0%
Memory size: 393.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 716
Q3: 1441
95-th percentile: 6457.6
Maximum: 35987
Range: 35987
Interquartile range (IQR): 1441

Descriptive statistics

Standard deviation: 2803.0457
Coefficient of variation (CV): 1.9453891
Kurtosis: 27.478447
Mean: 1440.8663
Median Absolute Deviation (MAD): 716
Skewness: 4.4175882
Sum: 12525451
Variance: 7857065.2
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    3653   42.0%
809                  54     0.6%
788                  40     0.5%
804                  39     0.4%
803                  34     0.4%
907                  32     0.4%
908                  32     0.4%
791                  30     0.3%
888                  29     0.3%
716                  27     0.3%
Other values (2326)  4723   54.3%

Minimum 10 values

Value  Count  Frequency (%)
0      3653   42.0%
1      2      < 0.1%
2      1      < 0.1%
4      1      < 0.1%
8      1      < 0.1%
10     3      < 0.1%
11     2      < 0.1%
17     1      < 0.1%
21     1      < 0.1%
33     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
35987  1      < 0.1%
31076  1      < 0.1%
31074  1      < 0.1%
30478  1      < 0.1%
29608  1      < 0.1%
28074  1      < 0.1%
27848  1      < 0.1%
27842  1      < 0.1%
27650  1      < 0.1%
27428  1      < 0.1%

Interactions

[Figures: 81 interaction plots (Matplotlib SVG images) were embedded here; the images were not preserved in this text version.]

Missing values

[Figure: Count] A simple visualization of nullity by column.
[Figure: Matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

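First rows

PassengerId | HomePlanet | CryoSleep | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Transported | Passenger_Group | Cabin_Deck | Cabin_Number | Cabin_Side | Total_Spend
0001_01 | Europa | False | TRAPPIST-1e | 39.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False | 0001 | B | 0 | P | 0.0
0002_01 | Earth | False | TRAPPIST-1e | 24.0 | False | 109.0 | 9.0 | 25.0 | 549.0 | 44.0 | True | 0002 | F | 0 | S | 736.0
0003_01 | Europa | False | TRAPPIST-1e | 58.0 | True | 43.0 | 3576.0 | 0.0 | 6715.0 | 49.0 | False | 0003 | A | 0 | S | 10383.0
0003_02 | Europa | False | TRAPPIST-1e | 33.0 | False | 0.0 | 1283.0 | 371.0 | 3329.0 | 193.0 | False | 0003 | A | 0 | S | 5176.0
0004_01 | Earth | False | TRAPPIST-1e | 16.0 | False | 303.0 | 70.0 | 151.0 | 565.0 | 2.0 | True | 0004 | F | 1 | S | 1091.0
0005_01 | Earth | False | PSO J318.5-22 | 44.0 | False | 0.0 | 483.0 | 0.0 | 291.0 | 0.0 | True | 0005 | F | 0 | P | 774.0
0006_01 | Earth | False | TRAPPIST-1e | 26.0 | False | 42.0 | 1539.0 | 3.0 | 0.0 | 0.0 | True | 0006 | F | 2 | S | 1584.0
0006_02 | Earth | True | TRAPPIST-1e | 28.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 0006 | G | 0 | S | 0.0
0007_01 | Earth | False | TRAPPIST-1e | 35.0 | False | 0.0 | 785.0 | 17.0 | 216.0 | 0.0 | True | 0007 | F | 3 | S | 1018.0
0008_01 | Europa | True | 55 Cancri e | 14.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 0008 | B | 1 | P | 0.0

Last rows

PassengerId | HomePlanet | CryoSleep | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Transported | Passenger_Group | Cabin_Deck | Cabin_Number | Cabin_Side | Total_Spend
9272_02 | Earth | False | TRAPPIST-1e | 21.0 | False | 86.0 | 3.0 | 149.0 | 208.0 | 329.0 | False | 9272 | F | 1894 | P | 775.0
9274_01 | NaN | True | TRAPPIST-1e | 23.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 9274 | G | 1508 | P | 0.0
9275_01 | Europa | False | TRAPPIST-1e | 0.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 9275 | A | 97 | P | 0.0
9275_02 | Europa | False | TRAPPIST-1e | 32.0 | False | 1.0 | 1146.0 | 0.0 | 50.0 | 34.0 | False | 9275 | A | 97 | P | 1231.0
9275_03 | Europa | NaN | TRAPPIST-1e | 30.0 | False | 0.0 | 3208.0 | 0.0 | 2.0 | 330.0 | True | 9275 | A | 97 | P | 3540.0
9276_01 | Europa | False | 55 Cancri e | 41.0 | True | 0.0 | 6819.0 | 0.0 | 1643.0 | 74.0 | False | 9276 | A | 98 | P | 8536.0
9278_01 | Earth | True | PSO J318.5-22 | 18.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False | 9278 | G | 1499 | S | 0.0
9279_01 | Earth | False | TRAPPIST-1e | 26.0 | False | 0.0 | 0.0 | 1872.0 | 1.0 | 0.0 | True | 9279 | G | 1500 | S | 1873.0
9280_01 | Europa | False | 55 Cancri e | 32.0 | False | 0.0 | 1049.0 | 0.0 | 353.0 | 3235.0 | False | 9280 | E | 608 | S | 4637.0
9280_02 | Europa | False | TRAPPIST-1e | 44.0 | False | 126.0 | 4688.0 | 0.0 | 0.0 | 12.0 | True | 9280 | E | 608 | S | 4826.0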
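In the sample rows, Total_Spend looks like the row-wise sum of the five amenity columns, and Passenger_Group looks like the part of PassengerId before the underscore. The sketch below checks both on a few rows copied from the sample. How these columns were actually engineered is not stated in the report, so the relationships are assumptions that hold for the rows shown.

```python
AMENITIES = ("RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck")

# A few rows copied from the sample tables above.
sample_rows = [
    {"PassengerId": "0002_01", "RoomService": 109.0, "FoodCourt": 9.0,
     "ShoppingMall": 25.0, "Spa": 549.0, "VRDeck": 44.0,
     "Passenger_Group": "0002", "Total_Spend": 736.0},
    {"PassengerId": "0003_01", "RoomService": 43.0, "FoodCourt": 3576.0,
     "ShoppingMall": 0.0, "Spa": 6715.0, "VRDeck": 49.0,
     "Passenger_Group": "0003", "Total_Spend": 10383.0},
    {"PassengerId": "9280_02", "RoomService": 126.0, "FoodCourt": 4688.0,
     "ShoppingMall": 0.0, "Spa": 0.0, "VRDeck": 12.0,
     "Passenger_Group": "9280", "Total_Spend": 4826.0},
]

def total_spend(row):
    """Sum the five amenity columns of one row."""
    return sum(row[name] for name in AMENITIES)

def passenger_group(passenger_id):
    """Return the group prefix of an id shaped like 'gggg_pp'."""
    return passenger_id.split("_")[0]

spend_matches = [total_spend(r) == r["Total_Spend"] for r in sample_rows]
group_matches = [passenger_group(r["PassengerId"]) == r["Passenger_Group"]
                 for r in sample_rows]

print(spend_matches, group_matches)
```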

Duplicate rows

Most frequently occurring

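# | HomePlanet | CryoSleep | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Transported | Passenger_Group | Cabin_Deck | Cabin_Number | Cabin_Side | Total_Spend | # duplicates
0 | Earth | False | 55 Cancri e | 0.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 3476 | G | 571 | P | 0.0 | 2
1 | Earth | True | TRAPPIST-1e | 0.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False | 6020 | G | 974 | P | 0.0 | 2
2 | Earth | True | TRAPPIST-1e | 0.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 3519 | G | 577 | P | 0.0 | 2
3 | Earth | True | TRAPPIST-1e | 2.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 4474 | G | 730 | S | 0.0 | 2
4 | Europa | True | 55 Cancri e | 18.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 0504 | B | 19 | S | 0.0 | 2
5 | Europa | True | 55 Cancri e | 30.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 0642 | C | 25 | S | 0.0 | 2
6 | Europa | True | TRAPPIST-1e | 28.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 3279 | C | 123 | S | 0.0 | 2
7 | Mars | False | TRAPPIST-1e | 1.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 8681 | F | 1787 | P | 0.0 | 2
8 | Mars | False | TRAPPIST-1e | 4.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 5142 | F | 1050 | P | 0.0 | 2
9 | Mars | True | 55 Cancri e | 20.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True | 2234 | F | 448 | P | 0.0 | 2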